Educative: Interactive Courses for Software Developers

The chief problem in the Readable Passwords antipattern is that the original form of the password is readable. But we can authenticate the user’s input against a password without ever reading the password itself. This section describes how to implement this kind of secure password storage in an SQL database.

Understanding hash functions

We can do this by encoding the password using a one-way cryptographic hash function. This function transforms the input string into a new string, called a hash, which is unrecognizable. Even the length of the original string is obscured because the hash returned by a hash function is a fixed-length string. For example, the SHA-256 algorithm converts our example password, “xyzzy”, to a 256-bit value, usually represented as a 64-character string of hexadecimal digits:

SHA2('xyzzy') = '184858a00fd7971f810848266ebcecee5e8b69972c5ffaed622f5ee078671aed'
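The fixed-length property can be checked outside the database as well. The following is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex is an assumption made for illustration and does not appear in the lesson.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for candidate in ["xyzzy", "a much longer passphrase than the first one"]:
        digest = sha256_hex(candidate)
        # The digest is 64 hex characters regardless of the input length.
        print(len(candidate), len(digest), digest)
```

Both inputs produce a digest of the same length, so the stored value reveals nothing about how long the original password was.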

Another characteristic of a hash is that it’s not reversible. We can’t recover the input string from its hash because the hashing algorithm is designed to “lose” some information about the input. A good hashing algorithm should take as much work to crack as it would to simply guess the input through trial and error.

A popular algorithm in the past has been SHA-1, but researchers have shown that this 160-bit hashing algorithm has insufficient cryptographic strength: it can be attacked with less work than guessing by trial and error, although the attack is still very time-consuming. The National Institute of Standards and Technology (NIST) announced a plan to phase out SHA-1 as an approved secure hashing algorithm in the U.S. after 2010 in favor of these stronger variants: SHA-224, SHA-256, SHA-384, and SHA-512. Whether we need to comply with NIST standards or not, it’s a good idea to use at least SHA-256 for passwords.

The MD5() function is another popular hash function, producing hash strings of 128 bits. This function has also been shown to be cryptographically weak, so we shouldn’t use it for encoding passwords. Weaker algorithms still have their uses, but not for sensitive information like passwords.
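To make the bit lengths mentioned above concrete, this short sketch asks Python's hashlib module for the output size of each algorithm. The helper name digest_bits is hypothetical, and the sketch assumes the local Python build still provides MD5 and SHA-1.

```python
import hashlib

# Algorithms named in the text, from weakest to strongest.
ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]


def digest_bits(name: str) -> int:
    """Return the output size of the named hash algorithm in bits."""
    return hashlib.new(name).digest_size * 8


if __name__ == "__main__":
    for name in ALGORITHMS:
        # Each hex character encodes 4 bits, so hex length = bits / 4.
        print(f"{name}: {digest_bits(name)} bits, {digest_bits(name) // 4} hex characters")
```

A larger output size is only one ingredient of cryptographic strength, but it explains why a SHA-256 column needs 64 characters while an MD5 column would need only 32.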

Using a hash in SQL

The following is a redefinition of the Accounts table. The SHA-256 password hash is always 64 characters long, so we define the column as a fixed-length CHAR column of that length.

Creating Accounts table to store password

Hashing functions aren’t part of the standard SQL language, so we may need to rely on our database brand to support hashing as an extension. For example, MySQL 5.5 and later, when built with SSL support, includes a function SHA2(str, hash_length); passing 256 as the second argument returns a 256-bit hash.

Inserting a record in the Accounts table

We can validate a user’s input by applying the same hash function to it and comparing the result to the value stored in the database.

Trying to know if we can retrieve a record matching the password
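As a rough sketch of the store-and-verify flow described above, the following Python script uses an in-memory SQLite database as a stand-in for MySQL. SQLite has no built-in SHA2(), so the script registers a helper named sha2 that mimics the two-argument MySQL function. The helper, the function names open_demo_db and password_matches, and the use of SQLite are all assumptions made for illustration.

```python
import hashlib
import sqlite3


def sha2(text, bits):
    """Stand-in for MySQL's SHA2(str, hash_length); returns lowercase hex."""
    if isinstance(text, str):
        text = text.encode("utf-8")
    return hashlib.new(f"sha{bits}", text).hexdigest()


def open_demo_db() -> sqlite3.Connection:
    """Create the Accounts table and insert one account with a hashed password."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("SHA2", 2, sha2)
    # INTEGER PRIMARY KEY replaces SERIAL here because this sketch runs on SQLite.
    conn.execute(
        """CREATE TABLE Accounts (
             account_id    INTEGER PRIMARY KEY,
             account_name  VARCHAR(20),
             email         VARCHAR(100) NOT NULL,
             password_hash CHAR(64) NOT NULL
           )"""
    )
    conn.execute(
        "INSERT INTO Accounts (account_id, account_name, email, password_hash)"
        " VALUES (123, 'billkarwin', 'bill@example.com', SHA2(?, 256))",
        ("xyzzy",),
    )
    return conn


def password_matches(conn, account_id, attempt) -> bool:
    """Hash the attempt with the same function and compare it to the stored hash."""
    row = conn.execute(
        "SELECT password_hash = SHA2(?, 256) FROM Accounts WHERE account_id = ?",
        (attempt, account_id),
    ).fetchone()
    return row is not None and bool(row[0])
```

Calling password_matches(open_demo_db(), 123, "xyzzy") compares hashes only; the database never stores or returns the plain-text password.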

We can also lock an account easily by changing the value in the password hash to a string the hash function can’t return. For example, the string “noaccess” contains letters that aren’t hexadecimal digits.
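The locking trick can be sketched the same way. This example again assumes an in-memory SQLite database in place of MySQL and computes the hash in Python rather than in SQL; the names make_demo_db, lock_account, and can_log_in are hypothetical.

```python
import hashlib
import sqlite3


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_demo_db() -> sqlite3.Connection:
    """Create a minimal Accounts table with one account whose password is 'xyzzy'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Accounts ("
        " account_id INTEGER PRIMARY KEY, password_hash CHAR(64) NOT NULL)"
    )
    conn.execute("INSERT INTO Accounts VALUES (123, ?)", (sha256_hex("xyzzy"),))
    return conn


def lock_account(conn, account_id) -> None:
    """Overwrite the stored hash with a string no hex digest can ever equal."""
    conn.execute(
        "UPDATE Accounts SET password_hash = 'noaccess' WHERE account_id = ?",
        (account_id,),
    )


def can_log_in(conn, account_id, attempt) -> bool:
    """Return True only if the hash of the attempt equals the stored value."""
    row = conn.execute(
        "SELECT password_hash FROM Accounts WHERE account_id = ?", (account_id,)
    ).fetchone()
    return row is not None and row[0] == sha256_hex(attempt)
```

After lock_account runs, no input can succeed, because every digest consists solely of hexadecimal digits while “noaccess” contains the letters n, o, and s.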

Adding salt to our hash

If we store hashes instead of passwords and the attacker gains access to our database (by searching our trash for a CDROM backup, for example), they can still attempt to guess passwords by trial and error. Guessing each password may take a long time, but they can prepare their own database of hashes of likely passwords against which to compare the hash strings they find in our database. If even one user chose a password that is a word in a standard dictionary, the attacker can find it easily by searching our password database for hashes that match their prepared table of hashes.

They can even do this with SQL:

The attacker would have prepared a DictionaryHashes table.

Creating DictionaryHashes table

The attacker would have an entry similar to the one shown below in their table:

Inserting data into DictionaryHashes table

The attacker would then match the hash strings in our table against the SHA2 hashes stored in their own table.

Retrieving account details by matching password
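The attack described above can be simulated in a few lines. This sketch assumes an in-memory SQLite database in place of MySQL, computes the hashes in Python, and invents the column names of DictionaryHashes along with the second account and its password; none of these details come from the lesson.

```python
import hashlib
import sqlite3


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_victim_db() -> sqlite3.Connection:
    """Create an Accounts table holding unsalted hashes (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Accounts ("
        " account_name VARCHAR(20), password_hash CHAR(64) NOT NULL)"
    )
    accounts = [("billkarwin", "xyzzy"), ("alice", "T7#qv!9zLw")]
    conn.executemany(
        "INSERT INTO Accounts VALUES (?, ?)",
        [(name, sha256_hex(pw)) for name, pw in accounts],
    )
    return conn


def dictionary_attack(conn, words):
    """Precompute hashes of likely passwords, then join them to Accounts."""
    conn.execute(
        "CREATE TABLE DictionaryHashes ("
        " password VARCHAR(100), password_hash CHAR(64))"
    )
    conn.executemany(
        "INSERT INTO DictionaryHashes VALUES (?, ?)",
        [(word, sha256_hex(word)) for word in words],
    )
    # Every row returned is an account whose plain-text password is now known.
    return conn.execute(
        "SELECT a.account_name, h.password"
        " FROM Accounts AS a JOIN DictionaryHashes AS h"
        " ON a.password_hash = h.password_hash"
        " ORDER BY a.account_name"
    ).fetchall()
```

Only the account with a guessable password appears in the result, which is why a single weak password is enough for the attack to pay off.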

One way to defeat this kind of “dictionary attack” is by including a “salt” in our password-encoding expression. A salt is a string of meaningless bytes we concatenate with the user’s password, before passing the resulting string to the hash function. Even if the user chose a word in the dictionary as their password, the hash produced from a salted password won’t match the hash in the attacker’s hash database. For example, if the password is the word “password”, we can see that the hash of this word is different from a hash of the word with a few random bytes appended:

SHA2('password') = '5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8'

SHA2('password-0xT!sp9') = '7256d8d7741f740ee83ba7a9b30e7ac11fcd9dbd7a0147f4cc83c62dd6e0c45b'
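The effect of the salt on a precomputed table can be sketched as follows. The word list and the helper names sha256_hex and salted_hash are assumptions made for illustration; the salt value is the printable example used in the text.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def salted_hash(password: str, salt: bytes) -> str:
    """Hash the password with the salt bytes appended to it."""
    return sha256_hex(password.encode("utf-8") + salt)


# The attacker's precomputed table: hash -> dictionary word.
DICTIONARY = {
    sha256_hex(word.encode("utf-8")): word
    for word in ["password", "xyzzy", "letmein"]
}

unsalted = sha256_hex(b"password")
salted = salted_hash("password", b"-0xT!sp9")

if __name__ == "__main__":
    # The unsalted hash is found in the table; the salted one is not.
    print(DICTIONARY.get(unsalted))
    print(DICTIONARY.get(salted))
```

The lookup succeeds for the unsalted hash and fails for the salted one, even though the user chose the same dictionary word in both cases.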

Each password should use a different salt value so that an attacker has to generate a new dictionary table of hashes for each password. The attacker is then back to square one, because cracking the passwords in our database takes as much time as guessing them by trial and error.

CREATE TABLE Accounts (
  account_id SERIAL PRIMARY KEY,
  account_name VARCHAR(20),
  email VARCHAR(100) NOT NULL,
  password_hash CHAR(64) NOT NULL,
  salt BINARY(8) NOT NULL  -- a different salt for each account
);

-- Hash the password with the salt appended, and store the salt alongside the hash.
INSERT INTO Accounts (account_id, account_name, email,
    password_hash, salt)
  VALUES (123, 'billkarwin', 'bill@example.com',
    SHA2(CONCAT('xyzzy', '-0xT!sp9'), 256), '-0xT!sp9');

-- Verify a login attempt by hashing the input together with the stored salt.
SELECT (password_hash = SHA2(CONCAT('xyzzy', salt), 256)) AS password_matches
FROM Accounts
WHERE account_id = 123;

Adding salt to the hash

A good salt is 8 bytes long, generated randomly for each password. The previous examples show a salt string containing printable characters, but we can (and should) make a salt using printable and unprintable bytes.
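One way to produce such a salt in application code is sketched below, assuming Python's standard secrets module as the source of random bytes. The function names new_salt, hash_password, create_credentials, and verify are hypothetical, and the password-then-salt concatenation order follows the earlier examples.

```python
import hashlib
import secrets

SALT_BYTES = 8  # salt length suggested in the text


def new_salt() -> bytes:
    """Return 8 random bytes; any byte value may occur, printable or not."""
    return secrets.token_bytes(SALT_BYTES)


def hash_password(password: str, salt: bytes) -> str:
    """Return the SHA-256 hex digest of the password with the salt appended."""
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


def create_credentials(password: str):
    """Generate a fresh salt and return (salt, password_hash) for storage."""
    salt = new_salt()
    return salt, hash_password(password, salt)


def verify(attempt: str, salt: bytes, stored_hash: str) -> bool:
    """Re-hash the attempt with the stored salt and compare to the stored hash."""
    return hash_password(attempt, salt) == stored_hash


if __name__ == "__main__":
    salt_a, hash_a = create_credentials("xyzzy")
    salt_b, hash_b = create_credentials("xyzzy")
    # The same password yields different stored hashes under different salts.
    print(hash_a != hash_b, verify("xyzzy", salt_a, hash_a))
```

Because each account gets its own salt, two users who pick the same password end up with different values in the password_hash column.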
